
Add legal artifact presets, FOSSA-compatible outputs #199

Open
lelia wants to merge 37 commits into main from lelia/add-legal-checks

Contributor

@lelia lelia commented May 11, 2026

Summary

Introduces a compliance-oriented --legal workflow to socketcli and an opt-in --legal-format fossa mode for producing FOSSA-compatible artifact shapes.

Changes

The --legal workflow enables license generation and default artifact output for:

  • socket-report.json
  • socket-summary.txt
  • socket-report-link.txt
  • socket-sbom.json
  • socket-license.json

The new --legal-format fossa mode adapts those outputs to match the structural shapes the real FOSSA CLI emits — captured from a UiPath Azure DevOps FOSSA pipeline as reference (CE-199):

  • fossa-analyze.json — the composed wrapper FOSSA pipelines actually produce: {project, vulnerability[], licensing[], quality[]}. The project sub-object is the 6-key fossa analyze --json shape with id formatted as <projectLocator>$<revision>. vulnerability[] items follow the /api/v2/issues shape (28 fields including source, depths, statuses, projects[], remediation, metrics, epss, etc.).
  • fossa-sbom.json — the fossa report --json attribution shape: 5 top-level keys (copyrightsByLicense, deepDependencies, directDependencies, licenses, project). Per-Dependency entries are the 14-key FOSSA attribution shape, with attribution text sourced from Package.licenseAttrib[].attribText, direct/deep partitioning by Package.direct, and dependencyPaths as <ancestor> > <package> chains computed from topLevelAncestors.
  • FOSSA-like default filenames: fossa-analyze.json, fossa-test.txt, fossa-link.txt, fossa-sbom.json. The Socket-side --sbom-file slot is suppressed in fossa mode (the FOSSA "SBOM" artifact is the attribution payload).
  • Consistent JSON formatting: both JSON outputs written with indent=2.
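The wrapper shape described above can be sketched as follows. This is a minimal illustration, not the code in socketsecurity/fossa_compat.py: the function names are hypothetical, and only the four top-level keys, the `<projectLocator>$<revision>` id format, and indent=2 come from this description. The other five project keys are not listed here, so the sketch leaves them out.

```python
import json


def format_project_id(project_locator: str, revision: str) -> str:
    """Join locator and revision with a dollar sign, as described for project.id."""
    return f"{project_locator}${revision}"


def compose_analyze_wrapper(project, vulnerability=(), licensing=(), quality=()):
    """Build the four-key wrapper: {project, vulnerability[], licensing[], quality[]}."""
    return {
        "project": dict(project),
        "vulnerability": list(vulnerability),
        "licensing": list(licensing),
        "quality": list(quality),
    }


def dump_artifact(payload) -> str:
    """Serialize with indent=2 so both JSON artifacts share one format."""
    return json.dumps(payload, indent=2)


# Hypothetical input values, for illustration only.
wrapper = compose_analyze_wrapper(
    {"id": format_project_id("1234/example-validation-project", "abc123")}
)
print(dump_artifact(wrapper))
```

Empty lists are emitted for sections without findings so that consumers checking for the four keys still see them.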

Adds

  • Explicit file output support for JSON reports, summary text, and report links
  • Hardened legal artifact generation for sparse scan paths so artifact creation completes safely even when SBOM/package data is incomplete

Documented gaps

Fields with no Socket data source are emitted as consistent documented defaults (see module docstring at top of socketsecurity/fossa_compat.py). Examples: vulnerability[].epss, cvssVector, exploitability, cveStatus, published, customRiskScore, project timestamps, semver-distance labels; per-dependency description, downloadUrl, projectUrl, hash, isGolang, notes, otherLicenses; top-level copyrightsByLicense and licenses body-text map. partialFix and completeFix collapse to the same value since Socket has only one fix-version concept.
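As a rough sketch of how gap defaults and the fix-version collapse could work: the helper names below are hypothetical, and placing partialFix / completeFix directly in one dict is an assumption about the layout. The only default value confirmed in this thread is customRiskScore: None; the module docstring remains the authoritative list.

```python
def build_remediation(fix_version):
    """Socket has one fix-version concept, so both FOSSA fields get the same value."""
    return {"partialFix": fix_version, "completeFix": fix_version}


def with_gap_defaults(entry: dict, defaults: dict) -> dict:
    """Fill keys that have no Socket data source; real values in `entry` win."""
    merged = dict(defaults)
    merged.update(entry)
    return merged


# Hypothetical vulnerability entry with one documented gap field filled in.
vuln = with_gap_defaults({"id": 1}, {"customRiskScore": None})
vuln["remediation"] = build_remediation("1.2.3")
```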

Testing

  • Unit test coverage for:
    • --legal and --legal-format defaults
    • FOSSA-compatible analyze report shape (top-level keysets, vulnerability/licensing/quality item shapes)
    • FOSSA attribution shape (5 top-level keys, 14-field per-Dependency entries, direct/deep partitioning, dependency-path computation, license attribution sourcing with fallback chain)
    • Sparse-data scenarios
  • Structural parity tests (tests/unit/test_fossa_parity.py) that load real FOSSA artifacts captured from the UiPath pipeline (committed to tests/fixtures/fossa/) and assert our builder output's keysets match at every level (top-level, project, dependency). These guard against future drift from FOSSA's actual shape.
  • 232 unit tests passing.
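A keyset parity check of the kind described above might look like the sketch below. The helper names are illustrative and are not taken from tests/unit/test_fossa_parity.py. The check compares keys only, never values.

```python
def keyset_diff(expected: dict, actual: dict) -> dict:
    """Report keys present on one side only; values are deliberately ignored."""
    return {
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
    }


def assert_same_keyset(expected: dict, actual: dict, label: str) -> None:
    """Fail with a readable message when the two keysets differ."""
    diff = keyset_diff(expected, actual)
    assert not diff["missing"] and not diff["unexpected"], f"{label}: {diff}"


# The five top-level attribution keys named in this PR; values are placeholders.
fixture = {
    "copyrightsByLicense": {},
    "deepDependencies": [],
    "directDependencies": [],
    "licenses": {},
    "project": {},
}
built = dict.fromkeys(fixture)  # stand-in for real builder output
assert_same_keyset(fixture, built, "top-level")
```

Because only keys are compared, fixture values can be replaced with placeholders without changing test results.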

Test plan

  • Unit tests pass (uv run pytest tests/)
  • Structural parity tests assert keyset equality against real FOSSA fixtures
  • Manual end-to-end against a real Socket scan with --legal-format fossa and confirm outputs satisfy the customer's validation pipeline gate (file exists, non-empty, parseable JSON for the two JSON files)
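The validation gate in the last item (file exists, non-empty, parseable JSON for the JSON files) can be approximated locally with a small check like this one. It is a sketch of the three stated conditions only; the customer pipeline's actual implementation is not shown in this PR, and the function name is hypothetical.

```python
import json
from pathlib import Path


def check_artifact(path, expect_json: bool) -> str:
    """Return "ok", or a short reason the artifact would fail the gate."""
    artifact = Path(path)
    if not artifact.is_file():
        return "missing"
    if artifact.stat().st_size == 0:
        return "empty"
    if expect_json:
        try:
            json.loads(artifact.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            return "invalid JSON"
    return "ok"


# Default fossa-mode filenames from this PR; only the .json files are parsed.
for name in ("fossa-analyze.json", "fossa-test.txt", "fossa-link.txt", "fossa-sbom.json"):
    print(name, check_artifact(name, expect_json=name.endswith(".json")))
```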

@lelia lelia requested a review from Douglas (dacoburn) May 11, 2026 16:13
Signed-off-by: lelia <[email protected]>
@github-actions

github-actions Bot commented May 11, 2026

🚀 Preview package published!

Install with:

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple socketsecurity==2.2.91.dev1

Docker image: socketdev/cli:pr-199

@lelia lelia changed the title from "[DRAFT] Simplify compliance workflow with legal preset and artifacts" to "Add legal artifact presets, FOSSA-compatible outputs" May 18, 2026
@lelia lelia marked this pull request as ready for review May 18, 2026 18:55
@lelia lelia requested a review from a team as a code owner May 18, 2026 18:55
lelia and others added 6 commits May 21, 2026 15:49
Adds customRiskScore: None to vulnerability entries (FOSSA samples
include this field, sometimes null). Documents all gap fields and their
defaults in the module docstring. Locks the new key in EXPECTED_VULNERABILITY_KEYS.
Replaces the 2-key {project, dependencies} shape with the real FOSSA
attribution shape: copyrightsByLicense, deepDependencies,
directDependencies, licenses, project.

The SBOM project field is now the 2-key {name, revision} subset rather
than the 6-key analyze project shape. _partition_dependencies is a stub
returning ([], []) until Tasks 7-9 fill in per-dependency entries.
Add _build_dependency_entry and _build_dependency_licenses to produce
the 14-key per-dependency dict that matches real FOSSA attribution
output. License entries prefer licenseAttrib (full attribText + spdxExpr),
fall back to declared license string, or emit [] when unlicensed.

Also removes the stale test_fossa_attribution_payload_shape_is_stable
test, which asserted the pre-Task-6 two-key shape and was already
failing.
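The license fallback chain from this commit can be sketched roughly as below. This is not the actual _build_dependency_licenses: the output keys name / attribution and the input key license for the declared string are assumptions made for illustration. Only licenseAttrib, attribText, and spdxExpr come from the commit message.

```python
def build_license_entries(package: dict) -> list:
    """Prefer licenseAttrib, fall back to the declared license, else return []."""
    attribs = package.get("licenseAttrib") or []
    if attribs:
        return [
            {
                "name": attrib.get("spdxExpr", ""),
                "attribution": attrib.get("attribText", ""),
            }
            for attrib in attribs
        ]
    declared = package.get("license")  # hypothetical field name
    if declared:
        return [{"name": declared, "attribution": ""}]
    return []  # unlicensed package
```

Returning an empty list rather than None keeps the field's type stable for consumers that iterate over it.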
Replaces the stub that always returned [package.name] with real logic:
direct deps emit just their name; transitive deps emit one
"<ancestor> > <package>" chain per top-level ancestor, falling back to
name-only when ancestors are absent or not in the lookup.
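One way to read the path logic in this commit is sketched below. It assumes topLevelAncestors holds package ids resolved through an id-to-package lookup, and that ancestors missing from the lookup are skipped one by one; both points are my interpretation, not confirmed by the commit message. The function name is hypothetical.

```python
def dependency_paths(package: dict, packages_by_id: dict) -> list:
    """Direct deps yield [name]; transitive deps yield one chain per known ancestor."""
    name = package["name"]
    if package.get("direct"):
        return [name]
    chains = []
    for ancestor_id in package.get("topLevelAncestors") or []:
        ancestor = packages_by_id.get(ancestor_id)
        if ancestor is not None:
            chains.append(f"{ancestor['name']} > {name}")
    # Fall back to name-only when no ancestor could be resolved.
    return chains or [name]


# Hypothetical lookup and package, for illustration only.
lookup = {"pkg-1": {"name": "requests"}}
print(dependency_paths({"name": "urllib3", "topLevelAncestors": ["pkg-1"]}, lookup))
```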
Pin project.id to dollar separator, replace 2-key SBOM with 5-key
shape, and update per-dependency assertions to the 14-key
_build_dependency_entry contract.
@lelia
Contributor Author

lelia commented May 27, 2026

Eric Hibbs (@flowstate) review notes:

  • resolve the existing merge conflicts (pyproject.toml, uv.lock, socketsecurity/__init__.py)
  • sanitize any customer references/artifacts (e.g. tests/fixtures/fossa/README.md)
  • the --legal-format fossa format only includes new_alerts (plus unchanged_alerts with --strict-blocking enabled) which could under-represent full findings compared to the typical FOSSA pipeline
    • affected areas: _iter_selected_issues(), build_fossa_report_payload()
    • this could create a parity risk for FOSSA users expecting project-wide issue sets
  • README phrasing doesn't match the FOSSA SBOM shape - the code uses directDependencies / deepDependencies instead of a single dependencies key (L159 of README)
    • affected areas: fossa_compat.py and test_fossa_parity.py
    • this could cause confusion for users integrating against the schema

FOSSA's /api/v2/issues endpoint returns a point-in-time snapshot of all
issues at the scan revision, not only diff-new ones. The previous
implementation only included unchanged alerts when --strict-blocking
was set, causing FOSSA-mode output to under-represent project-wide
findings compared to the typical FOSSA pipeline.
Replace customer org ID and project name with generic placeholders
(1234/example-validation-project) across all four fixtures and the README.
Structural shape, key sets, value types, and per-field cardinality are
unchanged. Parity tests assert keysets only, so the substitution is
transparent to test behavior.
The SBOM artifact now matches FOSSA's `report --json attribution`
shape with five top-level keys, not the previously documented
`project` / `dependencies` two-key payload.
- pyproject.toml + socketsecurity/__init__.py: keep this branch's version
  (2.2.91, bumped for the legal-artifacts release)
- uv.lock: regenerated via `uv lock` after merge (don't hand-merge lockfiles)
@flowstate
Contributor

lelia, thanks for the review. Addressed all four points in commits ff9edd1..7444463:

  • Merge conflicts — merged main in (7444463). Kept this branch's version (2.2.91) in pyproject.toml and socketsecurity/__init__.py; regenerated uv.lock via uv lock rather than hand-merging.
  • Customer references — sanitized the fixture JSONs and README (fee62de). All four fixtures now use 1234/example-validation-project instead of the captured customer identifiers. Structural shape and key sets are unchanged; the parity tests assert keysets only so this is transparent to them. Confirmed no DevTools-Validation-Pipeline, 6060/DevTools, or saml/6060 markers remain.
  • _iter_selected_issues parity risk — fixed (ff9edd1). The strict_blocking gate is removed; unchanged_alerts now always flows into the FOSSA payload, matching the point-in-time-snapshot semantics of /api/v2/issues?...&scope[revision]=.... Added a regression test asserting that strict_blocking=True and strict_blocking=False produce the same CVE set in FOSSA mode. The data was already on diff_report.unchanged_alerts, so no SDK changes were needed.
  • README phrasing — updated L159 (aa472cb) from the old project / dependencies two-key shape to the actual five-key copyrightsByLicense / deepDependencies / directDependencies / licenses / project shape.
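The selection change in the third point, and the shape of its regression test, can be illustrated with the sketch below. The function and the cve field are hypothetical stand-ins, not the real _iter_selected_issues or alert model.

```python
def select_fossa_alerts(new_alerts, unchanged_alerts, strict_blocking=False):
    """Return the full snapshot: new and unchanged alerts together.

    strict_blocking is accepted only to show that it no longer affects
    which alerts reach the FOSSA payload.
    """
    return [*new_alerts, *unchanged_alerts]


# Hypothetical alert records for illustration.
new = [{"cve": "CVE-0000-0001"}]
unchanged = [{"cve": "CVE-0000-0002"}]

strict = {a["cve"] for a in select_fossa_alerts(new, unchanged, strict_blocking=True)}
relaxed = {a["cve"] for a in select_fossa_alerts(new, unchanged, strict_blocking=False)}
assert strict == relaxed == {"CVE-0000-0001", "CVE-0000-0002"}
```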

234 unit tests passing on the merged branch.
